Application of a Naïve Bayes Classifier to Assign Polyadenylation Sites from 3' End Deep Sequencing Data: A Dissertation
Abstract
Cleavage and polyadenylation of a precursor mRNA is important for transcription termination, mRNA stability, and regulation of gene expression. This process is directed by a multitude of protein factors and cis elements in the pre-mRNA sequence surrounding the cleavage and polyadenylation site. Importantly, the location of the cleavage and polyadenylation site helps define the 3’ untranslated region of a transcript, which is important for regulation by microRNAs and RNA binding proteins. Additionally, these sites have generally been poorly annotated. To identify 3’ ends, many techniques utilize an oligo-dT primer to construct deep sequencing libraries. However, this approach can lead to identification of artifactual polyadenylation sites due to internal priming in homopolymeric stretches of adenines. Previously, simple heuristic filters relying on the number of adenines in the genomic sequence downstream of a putative polyadenylation site have been used to remove these sites of internal priming. However, these simple filters may not remove all sites of internal priming and may also exclude true polyadenylation sites. Therefore, I developed a naïve Bayes classifier to identify putative sites from oligo-dT primed 3’ end deep sequencing as true or false/internally primed. Notably, this algorithm uses a combination of sequence elements to distinguish between true and false sites. Finally, the resulting algorithm is highly
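The contrast between a simple adenine-count filter and a naïve Bayes classifier can be sketched in a few lines of Python. This is only an illustrative sketch, not the dissertation's implementation: the window size, the adenine threshold, the two binary features (an A-rich downstream window and the presence of an AATAAA or ATTAAA hexamer upstream), the class labels, and the toy training sequences are all assumptions made up for the example.

```python
import math
from collections import Counter, defaultdict


def heuristic_is_internal_priming(downstream, window=10, max_a=6):
    """Simple filter: flag a site if the downstream window is A-rich.

    The window size and threshold are hypothetical values for illustration.
    """
    return downstream[:window].upper().count("A") >= max_a


def features(upstream, downstream):
    """Two hypothetical binary features describing a putative site."""
    up = upstream.upper()
    return {
        "a_rich_downstream": heuristic_is_internal_priming(downstream),
        "has_signal_hexamer": "AATAAA" in up or "ATTAAA" in up,
    }


class NaiveBayes:
    """Naive Bayes over binary features with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        # (label, feature name) -> counts of observed feature values
        self.feature_counts = defaultdict(Counter)

    def fit(self, rows, labels):
        for row, label in zip(rows, labels):
            self.class_counts[label] += 1
            for name, value in row.items():
                self.feature_counts[(label, name)][value] += 1
        return self

    def log_posteriors(self, row):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            score = math.log(n / total)  # log prior
            for name, value in row.items():
                count = self.feature_counts[(label, name)][value]
                # Each feature is binary, hence 2 * alpha in the denominator.
                score += math.log((count + self.alpha) / (n + 2 * self.alpha))
            scores[label] = score
        return scores

    def predict(self, row):
        scores = self.log_posteriors(row)
        return max(scores, key=scores.get)


# Made-up toy sequences (upstream, downstream, label); not real data.
TOY_SITES = [
    ("GCTAATAAAGTCTGCATTCG", "TGTCTGGTCA", "true"),
    ("CCATTAAATGCGTCTTGACG", "GTTTGCACTG", "true"),
    ("TTGAATAAACCGTGCTGTCC", "CATGTAAGTC", "true"),
    ("GGCAATAAATCGTGCAGTCC", "AAAACAAGAA", "true"),
    ("GCTGCCGTGAGTCTGCATCG", "AAAAAAGAAA", "internal_priming"),
    ("CCAGTCGATGCGTCTTGACG", "AAAGAAAAAA", "internal_priming"),
    ("TTGACGTCACCGTGCTGTCC", "AAAAAAAAAA", "internal_priming"),
]

model = NaiveBayes().fit(
    [features(up, down) for up, down, _ in TOY_SITES],
    [label for _, _, label in TOY_SITES],
)

# A candidate with an A-rich downstream window but an upstream AATAAA hexamer.
candidate = ("TCGAATAAAGCTTGCAGTCA", "AAAAGAAACA")
print(heuristic_is_internal_priming(candidate[1]))
print(model.predict(features(*candidate)))
```

On this toy data the adenine-count filter would discard the candidate, whereas the classifier weighs the A-rich window against the upstream hexamer. Treating the features as conditionally independent given the class is the "naïve" assumption that lets the per-feature log probabilities simply be added.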
Similar resources
Accurate identification of polyadenylation sites from 3′ end deep sequencing using a naïve Bayes classifier
Motivation: 3′ end processing is important for transcription termination, mRNA stability and regulation of gene expression. To identify 3′ ends, most techniques use an oligo-dT primer to construct deep sequencing libraries. However, this approach can lead to identification of artifactual polyadenylation sites due to internal priming in homopolymeric stretches of adenines. Although heuristic fil...
Accurate identification of polyadenylation sites from 3′ end deep sequencing using a naïve Bayes classifier
Motivation: 3′ end processing is important for transcription termination, mRNA stability and regulation of gene expression. To identify 3′ ends, most techniques use an oligo-dT primer to construct deep sequencing libraries. However, this approach can lead to identification of artifactual polyadenylation sites due to internal priming in homopolymeric stretches of adenines. Although heuristic filt...
PolyA_DB 3 catalogs cleavage and polyadenylation sites identified by deep sequencing in multiple genomes
PolyA_DB is a database cataloging cleavage and polyadenylation sites (PASs) in several genomes. Previous versions were based mainly on expressed sequence tags (ESTs), which had a limited amount and could lead to inaccurate PAS identification due to the presence of internal A-rich sequences in transcripts. Here, we present an updated version of the database based solely on deep sequencing data. ...
Analysis of C. elegans intestinal gene expression and polyadenylation by fluorescence-activated nuclei sorting and 3′-end-seq
Despite the many advantages of Caenorhabditis elegans, biochemical approaches to study tissue-specific gene expression in post-embryonic stages are challenging. Here, we report a novel experimental approach for efficient determination of tissue-specific transcriptomes involving the rapid release and purification of nuclei from major tissues of post-embryonic animals by fluorescence-activated nu...
Proposed Techniques to Remove Flaming Problems from Social Networking Sites and outcome of Naïve Bayes Classifier for Detection of Flames
Natural Language Processing (NLP) [1][2][5] is a field of computer science concerned with the interactions between computers and human (natural) languages. Social networking sites (SNS) are among the most effective communication tools nowadays, but they have also given rise to the problem of flaming, which is difficult to deal with. A flaming incident is triggered by comments and actions of users in SNS tha...
Publication date: 2015